
Feature/52 vla fine tuning #53

Merged
robertnishihara merged 1 commit into main from feature/52-vla-fine-tuning
Apr 7, 2026
Conversation

@shorbaji
Contributor

Add vla fine tuning template

@shorbaji shorbaji marked this pull request as ready for review March 27, 2026 13:49
@robertnishihara robertnishihara force-pushed the feature/52-vla-fine-tuning branch 3 times, most recently from 7c55ba2 to 530a0e1 Compare April 7, 2026 02:12
Fine-tunes the PI0.5 Vision-Language-Action model on a LeRobot robotics
dataset stored in S3, using Ray Data for CPU preprocessing and Ray Train
for distributed GPU training.

Key features:
- Streams LeRobot v3 (parquet + mp4) data from S3 via a custom Ray Data
  datasource with anonymous S3 access
- Preprocesses on CPU workers (rename cameras, HWC->CHW, /255 normalise)
  so GPU workers are never blocked on I/O or video decoding
- Expert-only fine-tuning: only the 4 action/time projection heads are
  trained; the PaliGemma backbone stays frozen
- BF16 mixed precision throughout (no GradScaler needed)
- Linear-warmup + cosine-decay LR schedule
- Fault-tolerant checkpointing via ray.train.report()
- Declarative, cloud-agnostic compute config targeting 8x L40S GPUs
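The CPU-side preprocessing described above (rename camera keys, HWC->CHW transpose, /255 normalisation) can be sketched roughly as follows. The camera key names and the per-row dict layout are assumptions for illustration, not the template's actual schema.

```python
import numpy as np

# Hypothetical mapping from LeRobot camera keys to the names the PI0.5
# policy expects -- the real template's key names may differ.
CAMERA_RENAMES = {
    "observation.images.top": "base_rgb",
    "observation.images.wrist": "wrist_rgb",
}

def preprocess_frame(row: dict) -> dict:
    """Rename camera keys and convert HWC uint8 images to CHW float32 in [0, 1]."""
    out = {}
    for src, dst in CAMERA_RENAMES.items():
        img = row.pop(src)                      # (H, W, C) uint8
        img = img.transpose(2, 0, 1)            # -> (C, H, W)
        out[dst] = img.astype(np.float32) / 255.0
    out.update(row)                             # pass through actions, state, etc.
    return out
```

In the template this kind of function would run on CPU workers, e.g. via a Ray Data `map` over the streamed dataset, so the GPU workers never spend time on decoding or normalisation.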
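The linear-warmup + cosine-decay LR schedule listed above can be written as a small pure-Python function; the step counts and peak learning rate below are illustrative, not the template's actual hyperparameters.

```python
import math

def lr_at_step(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to peak_lr, then cosine decay from peak_lr to 0."""
    if step < warmup_steps:
        # Linear ramp: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay over the remaining steps: progress goes 0 -> 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In a training loop this would typically be applied per optimizer step (e.g. by setting `param_group["lr"]` each iteration, or wrapped in a `torch.optim.lr_scheduler.LambdaLR`).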

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Robert Nishihara <rkn@anyscale.com>
@robertnishihara robertnishihara merged commit dca0393 into main Apr 7, 2026
@robertnishihara robertnishihara deleted the feature/52-vla-fine-tuning branch April 7, 2026 02:16
